Motivating Personality-aware Machine Translation
Abstract
Language use is known to be influenced by personality traits as well as by sociodemographic characteristics such as age or mother tongue. As a result, it is possible to automatically identify these traits of an author from her texts. It has recently been shown that knowledge of such dimensions can improve performance in NLP tasks such as topic and sentiment modeling. We posit that machine translation is another application that should be personalized. To motivate this, we explore whether translation preserves demographic and psychometric traits. We show that, largely, both translating the source training data into the target language and translating the target test data into the source language have a detrimental effect on the accuracy of predicting author traits. We argue that this supports the need for personal and personality-aware machine translation models.
Similar resources
Gender Aware Spoken Language Translation Applied to English-Arabic
Spoken Language Translation (SLT) is becoming more widely used and becoming a communication tool that helps in crossing language barriers. One of the challenges of SLT is the translation from a language without gender agreement to a language with gender agreement such as English to Arabic. In this paper, we introduce an approach to tackle such limitation by enabling a Neural Machine Translation...
Algorithms for Syntax-Aware Statistical Machine Translation
All of the non-trivial algorithms that are necessary for building and applying a rudimentary syntax-aware statistical machine translation system are generalized parsers. This paper extends the “translation by parsing” architecture by adding two components that are invariably used by state-of-the-art statistical machine translation systems. First, the paper shows how a generic syntax-aware trans...
Images as Context in Statistical Machine Translation
This paper reports ongoing experiments towards exploiting the use of images to provide additional context for statistical machine translation (SMT). We investigate whether this contextual information can be helpful in targeting two well-known challenges in machine translation: ambiguity (incorrect translation of words that have multiple senses) and out-of-vocabulary words (words left untranslat...
A Character-Aware Encoder for Neural Machine Translation
This article proposes a novel character-aware neural machine translation (NMT) model that views the input sequences as sequences of characters rather than words. On the use of row convolution (Amodei et al., 2015), the encoder of the proposed model composes word-level information from the input sequences of characters automatically. Since our model doesn’t rely on the boundaries between each wo...
Name-aware Machine Translation
We propose a Name-aware Machine Translation (MT) approach which can tightly integrate name processing into MT model, by jointly annotating parallel corpora, extracting name-aware translation grammar and rules, adding name phrase table and name translation driven decoding. Additionally, we also propose a new MT metric to appropriately evaluate the translation quality of informative words, by ass...